48 research outputs found
AdaSwarm: Augmenting Gradient-Based Optimizers in Deep Learning with Swarm Intelligence
This paper introduces AdaSwarm, a novel gradient-free optimizer whose performance is similar to, or even better than, that of the Adam optimizer commonly adopted in neural networks. To support AdaSwarm, we propose a novel Exponentially weighted Momentum Particle Swarm Optimizer (EMPSO). The ability of AdaSwarm to tackle optimization problems is attributed to its capability to perform good gradient approximations. We show that the gradient of any function, differentiable or not, can be approximated by using the parameters of EMPSO. This is a novel technique for simulating gradient descent (GD) that lies at the boundary between numerical methods and swarm intelligence. Mathematical proofs of the gradient approximation are also provided. AdaSwarm competes closely with several state-of-the-art (SOTA) optimizers. We also show that AdaSwarm is able to handle a variety of loss functions during backpropagation, including the maximum absolute error (MAE).
LogGENE: A smooth alternative to check loss for Deep Healthcare Inference Tasks
Mining large datasets and obtaining calibrated predictions from them is of
immediate relevance and utility in reliable deep learning. In our work, we
develop methods for deep-neural-network-based inference on datasets such as
gene expression data. However, unlike typical deep learning methods, our
inferential technique, while achieving state-of-the-art performance in terms of
accuracy, can also provide explanations, and report uncertainty estimates. We
adopt the Quantile Regression framework to predict full conditional quantiles
for a given set of housekeeping gene expressions. Conditional quantiles, in
addition to being useful in providing rich interpretations of the predictions,
are also robust to measurement noise. Our technique is particularly
consequential in high-throughput genomics, an area which is ushering in a new
era in personalized health care and targeted drug design and delivery. However,
the check loss, used in quantile regression to drive the estimation process, is
not differentiable. We propose log-cosh as a smooth alternative to the check loss.
We apply our methods to the GEO microarray dataset. We also extend the method to
the binary classification setting. Furthermore, we investigate other consequences
of the smoothness of the loss, such as faster convergence. We further apply the
classification framework to other healthcare inference tasks such as heart
disease, breast cancer, and diabetes. As a test of the generalization ability of
our framework, other non-healthcare data sets for regression and
classification tasks are also evaluated.
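The contrast between the check loss and a log-cosh surrogate can be illustrated with a short sketch. The abstract does not give the exact form of the proposed loss, so the version below is an assumption: it uses the identity rho_tau(u) = |u|/2 + (tau - 1/2)u and replaces |u| with log(cosh(u)). The function names (`check_loss`, `smooth_check_loss`, `smooth_check_grad`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def check_loss(residual, tau):
    """Standard check (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    u = np.asarray(residual, dtype=float)
    return u * (tau - (u < 0).astype(float))


def log_cosh(u):
    """Numerically stable log(cosh(u)) = |u| + log(1 + exp(-2|u|)) - log(2)."""
    a = np.abs(np.asarray(u, dtype=float))
    return a + np.log1p(np.exp(-2.0 * a)) - np.log(2.0)


def smooth_check_loss(residual, tau):
    """Assumed smooth surrogate (not necessarily the paper's exact loss).

    Uses rho_tau(u) = |u|/2 + (tau - 1/2) * u with |u| replaced by log(cosh(u)).
    """
    u = np.asarray(residual, dtype=float)
    return 0.5 * log_cosh(u) + (tau - 0.5) * u


def smooth_check_grad(residual, tau):
    """Derivative of the surrogate: 0.5 * tanh(u) + (tau - 0.5), defined at u = 0."""
    u = np.asarray(residual, dtype=float)
    return 0.5 * np.tanh(u) + (tau - 0.5)


if __name__ == "__main__":
    residuals = np.linspace(-3.0, 3.0, 7)
    print(check_loss(residuals, tau=0.9))
    print(smooth_check_loss(residuals, tau=0.9))
    print(smooth_check_grad(residuals, tau=0.9))
```

Under this assumed form, the surrogate has a well-defined derivative everywhere, including at zero residual where the check loss has a kink, and for large residuals it differs from the check loss only by a constant offset of 0.5·log 2.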
Interfacial control of vortex-limited critical current in type-II superconductor films
In a small subset of type-II superconductor films, the critical current is determined by a weakened Bean-Livingston barrier posed by the film surfaces to vortex penetration into the sample. The critical current of such a film thus depends sensitively on its surface or its interface with an adjacent material. We theoretically investigate the dependence of the vortex barrier and critical current in such films on the Rashba spin-orbit coupling at their interfaces with adjacent materials. Considering an interface with a magnetic insulator, we find that the spontaneous supercurrent resulting from the exchange field and interfacial spin-orbit coupling substantially modifies the vortex surface barrier, consistent with a previous prediction. Thus, we show that the critical currents in superconductor-magnet heterostructures can be controlled, and even enhanced, via the interfacial spin-orbit coupling. Since the latter can be controlled via a gate voltage, our analysis predicts a class of heterostructures amenable to gate-voltage modulation of superconducting critical currents. It also sheds light on the recently observed gate-voltage enhancement of critical current in NbN superconducting films.